PolarFormer: Multi-Camera 3D Object Detection with Polar Transformer

Abstract

3D object detection in autonomous driving aims to reason about "what" and "where" the objects of interest are in a 3D world. Following the conventional wisdom of previous 2D object detection, existing methods often adopt the canonical Cartesian coordinate system with perpendicular axes. However, we conjecture that this does not fit the nature of the ego car's perspective, as each onboard camera perceives the world in the shape of a wedge intrinsic to the imaging geometry, with radial (non-perpendicular) axes. Hence, in this paper we advocate the exploitation of the Polar coordinate system and propose a new Polar Transformer (PolarFormer) for more accurate 3D object detection in the bird's-eye-view (BEV), taking as input only multi-camera 2D images. Specifically, we design a cross-attention based Polar detection head without restriction on the shape of the input structure, to deal with irregular Polar grids. To tackle the unconstrained object scale variations along the Polar distance dimension, we further introduce a multi-scale Polar representation learning strategy. As a result, our model can make best use of the rasterized Polar representation by attending to the corresponding image observations in a sequence-to-sequence fashion, subject to the geometric constraints. Thorough experiments on the nuScenes dataset demonstrate that PolarFormer significantly outperforms state-of-the-art 3D object detection alternatives.
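To make the contrast between Cartesian and Polar BEV grids concrete, the following is a minimal sketch of how a ground-plane point in the ego frame could be assigned to a Polar grid cell. It is not the authors' implementation: the function names (`cartesian_to_polar`, `polar_bin`) and the grid parameters (`max_radius`, `num_radius_bins`, `num_azimuth_bins`) are hypothetical and chosen only for illustration.

```python
import math


def cartesian_to_polar(x, y):
    """Convert a BEV point (x, y) in the ego frame to (radius, azimuth in radians)."""
    radius = math.hypot(x, y)
    azimuth = math.atan2(y, x)  # range [-pi, pi]
    return radius, azimuth


def polar_bin(x, y, max_radius=50.0, num_radius_bins=10, num_azimuth_bins=8):
    """Return (radius_index, azimuth_index) of the Polar cell containing (x, y).

    Grid parameters are illustrative assumptions, not values from the paper.
    Returns None if the point lies outside the grid.
    """
    radius, azimuth = cartesian_to_polar(x, y)
    if radius >= max_radius:
        return None
    r_idx = int(radius / max_radius * num_radius_bins)
    # Map azimuth from [-pi, pi] to [0, 1], then to a bin index (wrapping at 2*pi).
    a_frac = (azimuth + math.pi) / (2 * math.pi)
    a_idx = int(a_frac * num_azimuth_bins) % num_azimuth_bins
    return r_idx, a_idx
```

With uniform radius bins, a cell far from the ego car covers a much larger ground area than a nearby one, which is one way to picture the scale variation along the distance dimension that the abstract mentions.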

Related resources

Multi-camera Multi-Object Tracking

In this paper, we propose a pipeline for multi-target visual tracking under a multi-camera system. For the multi-camera tracking problem, efficient data association across cameras and, at the same time, across frames becomes more important than in single-camera tracking. However, most multi-camera tracking algorithms emphasize single-camera, across-frame data association. Thus in o...

A 3d Time of Flight Camera for Object Detection

The knowledge of three-dimensional data is essential for many control and navigation applications. Especially in the industrial and automotive environment a fast and reliable acquisition of 3D data has become a main requirement for future developments. Moreover low cost 3D imaging has the potential to open a wide field of additional applications and solutions in markets like consumer electronic...

Multi-Camera Collision Detection allowing for Object Occlusions

A multi-camera-based collision detection system is presented. We describe the computation of global collision information for the entire surveilled workspace based on local collision information extracted from camera images. If there are known occlusions (e.g., by the robot), the system is able to recover object collision information by fusing multiple camera images. The algorithm presented is ...

Object detection with single-camera stereo

Many fielded mobile robot systems have demonstrated the importance of directly estimating the 3D shape of objects in the robot’s vicinity. The most mature solutions available today use active laser scanning or stereo camera pairs, but both approaches require specialized and expensive sensors. In prior publications, we have demonstrated the generation of stereo images from a single very low-cost...

3D Object Detection with Kinect

The goal of our project is to develop a general machine learning framework for classifying objects based on RGBD point cloud data from a Kinect. Using this framework, a robot equipped with a Kinect will take the name of an object as input, scan its surroundings, and move to the most likely matching object that it finds. As a proof of concept, we demonstrate our algorithm on an offic...

Journal

Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence

Year: 2023

ISSN: 2159-5399, 2374-3468

DOI: https://doi.org/10.1609/aaai.v37i1.25185